Multimodal deep learning has been used to predict clinical endpoints and diagnoses from clinical routine data. However, these models suffer from scaling issues: they must learn pairwise interactions between each piece of information in each data type, which escalates model complexity beyond manageable scales. This has so far precluded widespread use of multimodal deep learning. Here, we present a new technical approach of "learnable synergies", in which the model only selects relevant interactions between data modalities and keeps an "internal memory" of relevant data. Our approach is easily scalable and naturally adapts to multimodal data inputs from clinical routine. We demonstrate this approach on three large multimodal datasets from radiology and ophthalmology and show that it outperforms state-of-the-art models in clinically relevant diagnosis tasks. Our new approach is transferable and will allow the application of multimodal deep learning to a broad set of clinically relevant problems.
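The abstract does not detail the architecture; the following is only a minimal sketch of how "learnable synergies" with an "internal memory" could look, assuming a set of learnable memory tokens that cross-attend to gated modality embeddings. All module names, dimensions, and the gating mechanism are hypothetical and not the authors' implementation.

```python
import torch
import torch.nn as nn

class LearnableSynergyFusion(nn.Module):
    """Hypothetical sketch: learnable memory tokens attend to gated modality tokens."""
    def __init__(self, dim=256, n_memory=8, n_heads=4):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(n_memory, dim))       # "internal memory"
        self.gate = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())   # selects relevant tokens
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.head = nn.Linear(dim, 1)                                # e.g. one diagnosis logit

    def forward(self, modality_tokens):
        # modality_tokens: list of (B, N_i, dim) tensors, one per data modality
        tokens = torch.cat(modality_tokens, dim=1)                   # (B, sum N_i, dim)
        tokens = tokens * self.gate(tokens)                          # down-weight irrelevant inputs
        mem = self.memory.unsqueeze(0).expand(tokens.shape[0], -1, -1)
        fused, _ = self.attn(mem, tokens, tokens)                    # memory reads the modalities
        return self.head(fused.mean(dim=1))                          # pooled prediction

# Toy usage: image tokens plus tabular/laboratory tokens for two patients.
img_tokens = torch.randn(2, 196, 256)
lab_tokens = torch.randn(2, 12, 256)
print(LearnableSynergyFusion()([img_tokens, lab_tokens]).shape)      # torch.Size([2, 1])
```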
The success of deep learning applications critically depends on the quality and scale of the underlying training data. Generative adversarial networks (GANs) can generate arbitrarily large datasets, but diversity and fidelity are limited; this has recently been addressed by denoising diffusion probabilistic models (DDPMs), whose superiority has been demonstrated on natural images. In this study, we propose Medfusion, a conditional latent DDPM for medical images. We compare our DDPM-based model against GAN-based models, which constitute the current state-of-the-art in the medical domain. Medfusion was trained and compared with (i) StyleGAN-3 on n=101,442 images from the AIROGS challenge dataset to generate fundoscopies with and without glaucoma, (ii) ProGAN on n=191,027 images from the CheXpert dataset to generate radiographs with and without cardiomegaly, and (iii) wGAN on n=19,557 images from the CRCMS dataset to generate histopathological images with and without microsatellite stability. On the AIROGS, CRCMS, and CheXpert datasets, Medfusion achieved lower (= better) FID than the GANs (11.63 versus 20.43, 30.03 versus 49.26, and 17.28 versus 84.31, respectively). Fidelity (precision) and diversity (recall) were also higher (= better) for Medfusion on all three datasets. Our study shows that DDPMs are a superior alternative to GANs for image synthesis in the medical domain.
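As a rough illustration of the conditional latent-DDPM training step described above (encode the image to a latent, diffuse it to a random timestep, and predict the added noise conditioned on the class label), a minimal sketch follows; the encoder and denoiser are toy placeholders rather than Medfusion's actual components.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder components; a real latent DDPM would use a VAE/VQ encoder and a UNet denoiser.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256))   # image -> latent z
denoiser = nn.Linear(256 + 1 + 1, 256)                               # (z_t, t, label) -> noise estimate

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def ddpm_loss(images, labels):
    """One conditional latent-diffusion training step (simplified)."""
    z0 = encoder(images)                                              # clean latent
    t = torch.randint(0, T, (images.shape[0],))
    a = alphas_cumprod[t].unsqueeze(1)
    noise = torch.randn_like(z0)
    zt = a.sqrt() * z0 + (1 - a).sqrt() * noise                       # forward diffusion q(z_t | z_0)
    inp = torch.cat([zt, t.float().unsqueeze(1) / T, labels.float().unsqueeze(1)], dim=1)
    return F.mse_loss(denoiser(inp), noise)                           # epsilon-prediction objective

imgs = torch.randn(4, 3, 64, 64)
lbls = torch.randint(0, 2, (4,))                                      # e.g. glaucoma yes/no
print(ddpm_loss(imgs, lbls).item())
```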
Recent advances in computer vision have shown promising results in image generation. Diffusion probabilistic models in particular have generated realistic images from textual input, as demonstrated by DALL-E 2, Imagen, and Stable Diffusion. However, their use in medicine, where image data typically comprises three-dimensional volumes, has not been systematically evaluated. Synthetic images may play a crucial role in privacy-preserving artificial intelligence and can also be used to augment small datasets. Here we show that diffusion probabilistic models can synthesize high-quality medical imaging data, which we demonstrate for Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) images. We provide quantitative measurements of their performance through a reader study with two medical experts who rated the quality of the synthesized images in three categories: realistic image appearance, anatomical correctness, and consistency between slices. Furthermore, we demonstrate that synthetic images can be used in self-supervised pre-training and improve the performance of breast segmentation models when data is scarce (Dice score 0.91 without vs. 0.95 with synthetic data).
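The pre-training scheme mentioned above can be illustrated with a short, hedged sketch: pre-train a network on plentiful synthetic volumes with a self-supervised (here: denoising) objective, then fine-tune it on the scarce real, annotated data for segmentation. The 3D model, data, hyperparameters, and the reuse of a single decoder head are stand-ins for brevity, not the study's actual setup.

```python
import torch
import torch.nn as nn

# Stand-in 3D encoder/decoder; the study's actual network is not specified here.
encoder = nn.Sequential(nn.Conv3d(1, 8, 3, padding=1), nn.ReLU())
decoder = nn.Conv3d(8, 1, 3, padding=1)   # reused as reconstruction head and segmentation head

def run(volumes, targets, loss_fn, epochs=5, lr=1e-3):
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(decoder(encoder(volumes)), targets)
        loss.backward()
        opt.step()
    return loss.item()

# Stage 1: self-supervised pre-training on plentiful diffusion-generated volumes
# (reconstruct the clean synthetic volume from a noisy copy).
synthetic = torch.randn(8, 1, 16, 32, 32)
run(synthetic + 0.1 * torch.randn_like(synthetic), synthetic, nn.MSELoss())

# Stage 2: fine-tune on the scarce real, annotated volumes for segmentation.
real_vols = torch.randn(2, 1, 16, 32, 32)
real_masks = torch.rand(2, 1, 16, 32, 32).round()
print(run(real_vols, real_masks, nn.BCEWithLogitsLoss(), lr=1e-4))
```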
Osteoarthritis (OA) is the most common joint disorder, affecting a substantial proportion of the global population, primarily the elderly. Despite its individual and socioeconomic burden, the onset and progression of OA still cannot be reliably predicted. Aiming to fill this diagnostic gap, we introduce an unsupervised learning scheme based on generative models to predict the future development of OA from knee joint radiographs. Using longitudinal data from osteoarthritis studies, we explore latent temporal trajectories to predict a patient's future radiographs up to the eight-year follow-up visit. Our model predicts the risk of progression towards OA and surpasses its supervised counterpart, whose input was provided by seven experienced radiologists. With the support of our model, sensitivity, specificity, positive predictive value, and negative predictive value increased significantly from 42.1% to 51.6%, from 72.3% to 88.6%, from 28.4% to 57.6%, and from 83.9% to 88.4%, respectively, whereas without such support the radiologists performed only slightly better than random guessing. Although it requires no human annotation during training, our predictive model improves the prediction of OA onset and progression.
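Purely as an illustration of the idea of latent temporal trajectories, the following sketch encodes a patient's serial radiographs into a latent space, fits a linear trajectory over the visit times, and extrapolates it to a future visit before decoding. The encoder, decoder, and the linear trajectory model are assumptions for this sketch, not the authors' generative model.

```python
import torch
import torch.nn as nn

# Placeholder encoder/decoder standing in for the (unspecified) generative model.
latent_dim = 64
encoder = nn.Sequential(nn.Flatten(), nn.Linear(128 * 128, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 128 * 128), nn.Unflatten(1, (1, 128, 128)))

def predict_future_radiograph(radiographs, visit_years, target_year):
    """Fit a linear latent trajectory through a patient's past visits and extrapolate it."""
    z = encoder(radiographs)                                  # (n_visits, latent_dim)
    t = torch.tensor(visit_years, dtype=torch.float32).unsqueeze(1)
    A = torch.cat([t, torch.ones_like(t)], dim=1)             # design matrix for z(t) = slope*t + intercept
    coef = torch.linalg.lstsq(A, z).solution                  # (2, latent_dim): slope and intercept
    z_future = coef[0] * target_year + coef[1]                # extrapolated latent at the future visit
    return decoder(z_future.unsqueeze(0))                     # predicted future radiograph

past = torch.randn(3, 1, 128, 128)                            # toy visits at years 0, 1, 2
print(predict_future_radiograph(past, [0.0, 1.0, 2.0], target_year=8.0).shape)
```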
TMIC is an App Inventor extension for deploying ML models for image classification developed with Google Teachable Machine in educational settings. Google Teachable Machine is an intuitive visual tool that provides workflow-oriented support for developing ML models for image classification. Targeting the use of models developed with Google Teachable Machine, the extension TMIC deploys the trained models, exported as TensorFlow.js, as part of App Inventor, one of the most popular block-based programming environments for teaching computing in K-12. The extension was created with the App Inventor extension framework, based on the extension PIC, and is available under the BSD-3 license. It can be used in K-12, in introductory courses in higher education, or by anyone interested in creating intelligent apps with image classification. The extension TMIC was developed as part of the research effort of the initiative Computação na Escola of the Department of Informatics and Statistics at the Federal University of Santa Catarina, Brazil, aiming at introducing AI education in K-12.
We present Gradient-SDF, a novel representation for 3D geometry that combines the advantages of implicit and explicit representations. By storing at every voxel both the signed distance field and its gradient vector field, we enhance the capability of implicit representations with approaches originally formulated for explicit surfaces. As concrete examples, we show that (1) Gradient-SDF allows us to perform direct SDF tracking from depth images using efficient storage schemes such as hash maps, and that (2) the Gradient-SDF representation enables us to perform photometric bundle adjustment directly in the voxel representation (without transforming into a point cloud or a mesh), naturally a fully implicit optimization of geometry and camera poses, as well as easy geometry upsampling. Experimental results confirm that this leads to significantly sharper reconstructions. Since the overall SDF voxel structure is still respected, the proposed Gradient-SDF is equally suited for (GPU) parallelization as related approaches.
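The central data-structure idea described above, storing the SDF value together with its gradient at every voxel of a sparse, hash-map-backed grid, can be sketched as follows. This is a minimal illustration under assumed names, not the authors' implementation; it only shows how surface normals and surface projections fall out of the stored gradients without finite differencing.

```python
import numpy as np

class GradientSDF:
    """Minimal sketch: sparse voxel grid storing (sdf, gradient) per voxel in a hash map."""
    def __init__(self, voxel_size=0.05):
        self.voxel_size = voxel_size
        self.voxels = {}                      # (i, j, k) -> (sdf value, gradient vector)

    def _key(self, point):
        return tuple(np.floor(point / self.voxel_size).astype(int))

    def integrate(self, point, sdf, gradient):
        # In a real system this would fuse a new depth observation; here we just store it.
        self.voxels[self._key(point)] = (float(sdf), np.asarray(gradient, dtype=float))

    def normal(self, point):
        """Surface normal read directly from the stored gradient (no finite differences)."""
        _, grad = self.voxels[self._key(point)]
        return grad / np.linalg.norm(grad)

    def project_to_surface(self, point):
        """One Newton-like step: move against the gradient by the signed distance."""
        sdf, grad = self.voxels[self._key(point)]
        return point - sdf * grad / np.linalg.norm(grad)

grid = GradientSDF()
p = np.array([0.12, 0.03, 0.44])
grid.integrate(p, sdf=0.02, gradient=[0.0, 0.0, 1.0])
print(grid.normal(p), grid.project_to_surface(p))
```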